-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #483, #484 and above all handle errors happening in search stream leaves. #487
Conversation
db82dac
to
9c55514
Compare
d519bd1
to
f4c9d87
Compare
f4c9d87
to
c6287a7
Compare
c6287a7
to
0b8b095
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading this PR, I vote for changing SearchError::InternalError(String)
to SearchError::InternalError(anyhow::Error)
.
assert_eq!(metadata_file_exist, false); | ||
|
||
Ok(()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also test that the split file no longer exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think SearchError gets serialized/deserialized... anyhow cannot be serialized, hence the msg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like anyhow because it tends to do a better job at tracking the underlying error while we do a poor job at it. We could serialize the anyhow error as a string using the debug format, which displays the cause and a backtrace if one is present.
quickwit-search/src/client.rs
Outdated
result_sender.send(Ok(result)).map_err(|_| { | ||
SearchError::InternalError("Could not send leaf result".into()) | ||
})?; | ||
loop { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this logic hard to follow. results_stream
is a Streaming<T>
struct which implements Stream
. We can greatly simplify the whole thing:
let mut results_stream = grpc_client_clone
.leaf_search_stream(tonic_request)
.await
.map_err(|tonic_error| parse_grpc_error(&tonic_error))?
.into_inner()
.take_while(|grpc_result| futures::future::ready(grpc_result.is_ok()))
.map_err(|tonic_error| parse_grpc_error(&tonic_error));
while let Some(search_result) = results_stream.next().await {
result_sender.send(search_result).map_err(|_| {
SearchError::InternalError(
"Sender closed, could not send leaf result.".into(),
)
})?;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is actually going to swallow the last error :x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I want rust-lang/rust#62208. Anyway t_t
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the last element of the stream is None
so I think it should work, I really like your version, I did not think about the take_while
@@ -168,11 +168,13 @@ impl FastFieldCollectorBuilder { | |||
} | |||
|
|||
pub fn fast_field_to_warm(&self) -> Vec<String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had this return type be a Set
, we wouldn't have to do this. Is tag_fields
on IndexConfig
also a Vec
instead of a Set
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
Result::<(), SearchError>::Ok(()) | ||
}); | ||
} | ||
let request_clone = request.clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could "own" that request and not clone it as you did in some other parts of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1... Generally speaking if ownership is required, it is better to reflect it in the prototype. It does not necessarily avoid the clone, but at least it might.
let mut stream = | ||
leaf_search_results_stream(request_clone, storage, splits, index_config).await; | ||
while let Some(item) = stream.next().await { | ||
if let Err(error) = result_sender.send(item) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just my personal preference, but I like the map_err(...)?
pattern better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the if let Err
here.
The body here is just doing logging (so a side effect).
map_err
is more about converting an error isn't it?
storage.clone(), | ||
) | ||
}) | ||
.map(|(split, index_config_clone, request_clone, storage)| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the double map
? Readability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
readability and compiler issue I had but it's no more relevant.
I mostly made syntactical comments. The functional changes are great and good job on those tests. |
Co-authored-by: Adrien Guillo <adrien@quickwit.io>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading this PR, I vote for changing
SearchError::InternalError(String)
toSearchError::InternalError(anyhow::Error)
.
We serialize SearchError
to pass it through the network so I'm not sure how to handle it on the serialization side. We could serialize the error as a string but then we're back to square one. Nevertheless, we could at least have real errors on the root because these errors are not serialized.
Right now, I have the feeling that we already need to improve the errors we are handling (SearchError
, NodeSearchError
), I found it a little bit messy.
@@ -168,11 +168,13 @@ impl FastFieldCollectorBuilder { | |||
} | |||
|
|||
pub fn fast_field_to_warm(&self) -> Vec<String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
Glad to see unit tests on thjis part of the code! |
Fixes #483 #484
Bonus:
Not included: